PREFACE


As a part of an on-going reliability validation research program, NASA Langley 
Research Center sponsored a Sub-Working-Group Meeting on the Production of Reliable 
Flight-Crucial Software. This meeting, which was held at Research Triangle Institute 
November 2-4, 1981, specifically addressed the state of the art in the production of 
crucial software for flight control applications. It provided a forum where research-
ers communicated their ideas about how to develop highly reliable software and high-
lighted problems associated with reliable software production.

Meeting objectives were to survey the state of the art and identify areas where 
additional research is needed. A more specific objective of the sub-working-group
meeting was to obtain answers to the following questions: 

1. Is it meaningful to associate reliability metrics with software? If so, 
what are these metrics and how are they to be computed? 

2. How good are the classical methods used in the conventional software devel- 
opment cycle? Are they adequate for building crucial software assuming a 
composite set of quality metrics was defined? 

3. Are the more modern formal methods of building software sufficiently mature 
that they could be applied during the production of reliable software for 
digital flight control systems? 

The consensus was that it is meaningful to associate reliability metrics with
software. However, the precise nature of these metrics needs to be determined. 

Classical methods are inadequate for achieving a failure probability of 10^-9 for
a 10-hour flight. It was suggested that employing an eclectic set of complementary
techniques constitutes a feasible near-term solution using classical methods. This 
approach should yield a substantial improvement in the reliability of a given soft- 
ware system. 

Some formal methods are approaching feasibility for production use. Technical 
advances in the manageability of these methods must occur prior to their adoption. 

The meeting format involved brief and informal presentations followed by discus- 
sion. The earlier sessions considered conventional approaches to reliable software 
development while the later ones focused more on reliability measurement and the more 
formal methods. All presentations addressed the state of the art of the methodology 
under consideration. A general discussion of the main problems and research needs 
was held in the latter part of the second day. 

Each meeting participant submitted a prioritized list of three short-term and 
three long-term research needs. The results of this prioritization activity indicat- 
ed a short-term need for research in the areas of tool development and software fault 
tolerance. For the long term, research in formal verification or proof methods was 
recommended. Formal specification and software reliability modeling were recommend-
ed as topics for both short- and long-term research. Recommendations for research in-
clude the use of the NASA Avionics Integration Research Laboratory (AIRLAB).

This sub-working-group meeting on the production of reliable software was con- 
ceived and sponsored by personnel at NASA Langley Research Center, in particular 
Billy L. Dove and A. O. Lupton.

1.0 INTRODUCTION AND OVERVIEW


1.1 Problem Motivation 


A computer application is termed crucial if failure could endanger human life. 
An example is a full-time, full authority digital flight control system for commer- 
cial air transport. Present commercial aircraft use mechanical and hydraulic link- 
ages and analog controls in flight-critical applications. The next generation of 
aircraft is expected to use digital flight controls and digital communications be- 
tween the control system and the control surface actuators. Research in this area is 
important since there is the potential for substantial fuel savings and improved air- 
craft performance associated with entirely digital control systems. 

Crucial applications demand high reliability as well as validation that the re-
liability prediction is meaningful. A requirement of a system failure probability
of 10^-9 for a 10-hour flight has been used as a working figure. This reliability
requirement is a system requirement and therefore includes system failures result- 
ing from either hardware or software anomalies. A great deal of work has been done 
on systems designed to be tolerant of hardware faults [1,2]. However, further work
is needed to determine how to measure, cope with, or eliminate faults which occur in 
software. 

The problem of software quality has been studied extensively, but usually with 
the imprecise goal of improving quality rather than achieving a certain specified re- 
liability figure. For digital flight control systems to be accepted as suitable 
for commercial use, it will be necessary to show that the required software reli- 
ability has been achieved. 


1.2 Meeting Objectives 

The meeting objectives included surveying the state of the art in reliable soft- 
ware production and identifying research needs. A more specific objective of the 
sub-working-group meeting was to obtain answers to the following questions: 

1. Is it meaningful to associate reliability metrics with software? If so, 
what are these metrics and how are they to be computed? 

2. How good are the classical methods used in the conventional software devel- 
opment cycle? Are they adequate for building crucial software assuming a 
set of quality metrics was defined? 

3. Are the more formal modern methods of building software sufficiently mature 
that they could be applied during the production of reliable software for 
digital flight control systems? 

Research recommendations could include the use of the NASA AIRLAB facility. 

1.3 State of the Art in the Production of Reliable Software 


Present day production of software for crucial systems relies on a balanced al- 
location of a myriad of resources and represents a costly endeavor. The developers 
of the software for the Space Shuttle used accepted technology, review boards, and 
brute force testing to maximize the reliability of the software and their confidence 
in it. Their goal (which coincides with the goals of the IBM Cleanroom project) was 
to produce error-free software. Whether error-free software is attainable remains an 
open question. 



Various software engineering approaches exist which contribute to the relia- 
bility of a software system. The extent of their contributions still needs to be 
quantitatively determined. 

Formal specifications are a critical aspect of highly reliable software and pre- 
suppose a mechanism for determining the equivalence of the specification with the in- 
tent. Technology for addressing this problem does not exist today. 

The available software reliability models require very large amounts of execu-
tion time to produce accurate estimates if the software is close to achieving a fail-
ure probability of 10^-9 in a 10-hour flight. The problems of assuring the reliabil-
ity of software may be more difficult than those encountered during attempts to pro-
duce it.

The reliability of any software depends on the reliability of a considerable 
body of support software (tools, languages, processors, etc.). High-level language 
implementations must be reliable if programs written in those languages are to be re- 
liable. Experimental efforts are under way to collect a set of tools in a unified 
system for programmers' use. Systems which will assist management with administra- 
tion of a software project are also under development. 

1.4 Summary of Results 

The overriding group consensus was that the currently stated reliability re- 
quirements for software alone cannot be achieved or confirmed with current technol- 
ogy. Available evidence indicates that current reliability figures are orders of 
magnitude less than required. 

For the short term the highest priority research recommendations were: 

1. formal specification 

2. software environment and tool development 

3. reliability prediction, estimates, and measurement 

4. fault-tolerant designs 

5. formal verification 

For the long term the highest priority research recommendations were: 

1. reliability prediction, estimation, and measurement

2. formal specification 

3. formal verification 

It was suggested that AIRLAB might serve as a repository for sample flight con-
trol problems, support tools, and experimental results. Statistically controlled
software development experiments using flight control problems as vehicles for coher- 
ency could be performed in AIRLAB. These experiments would permit measurement of the 
contributions that different software development methodologies make to reliability. 

2.0 RELIABLE SOFTWARE DEVELOPMENT PROJECTS


2.1 Producing Reliable Software for the Space Shuttle

The Space Shuttle Software System is unique in that it uses software to perform 
crucial functions with no analog backup. The primary goals of the Shuttle software 
developing agency (IBM-FSD) were to produce software which meets the intent of custo-
mer requirements, have the software perform in accordance with the customer's opera-
tional expectations, and produce software which is free from errors. The software
development process comprised a composite set of mid-1970s techniques.
Software reliability measures, formal specification languages, and formal verifi- 
cation methods were not used. 

Four system test facilities were used throughout the production process. These 
are: a) a software development laboratory, b) a software-hardware integration labor-
atory, c) a flight systems laboratory, and d) a crew training laboratory. The soft-
ware developers placed greater emphasis on the earlier part of the development cycle.
An early definition of development tools, the use of structured methodologies, strict 
configuration control, and the extensive use of review boards for decision making 
constituted the development approach. The developers fostered an adversary relation- 
ship between the designers and verifiers by maintaining their organizational indepen- 
dence. A concise yet thorough description of the Shuttle software development is 
given in a paper by A.G. Macina [3].

One of the difficulties encountered during the development of the Shuttle soft- 
ware was the need to overlay software programs in memory. The function and size of 
the applications software were not considered in the hardware selection decision. In 
retrospect, it seems that software size should carry weight in this decision.

The selection of the quad-redundant design also posed problems in that a 2 by 2 
split was possible. Considerable effort was allocated to assuring that this condi- 
tion did not occur. In the development of the SIFT and FTMP computers this problem 
was solved via the theory of interactive consistency [4]. 
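
The 2 by 2 split can be illustrated with a minimal sketch of a strict-majority
voter. The function name and channel values below are hypothetical and are not
drawn from the Shuttle software; the sketch shows only why four channels that
divide evenly leave the voter without a decision.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of channels, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


# Four redundant channels, one of them faulty: a strict majority still exists.
one_fault = majority_vote([10, 10, 10, 99])

# Four channels split 2 by 2: no strict majority, so the voter cannot decide.
split = majority_vote([10, 10, 99, 99])
```

The sketch does not model the interactive consistency approach used for SIFT
and FTMP.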

Since reliability measures were not produced, the Shuttle developers have no 
measure of the reliability achieved. Brute force testing has increased their confi- 
dence but in an unquantified way. In addition, they are only minimally confident of 
correct operation in off-nominal flight operation. The software failed during simu- 
lation of an off-nominal situation 3 weeks prior to the second Shuttle launch. 

2.2 Controlling the Software Development Process - The SAGA Project

SAGA, a syntax-directed management system for software production, is an on- 
going research project at the University of Illinois [5]. This effort is aimed at 
tackling the complexity of software development projects by providing an interactive 
system for formally describing software production management and controlling the 
mechanization of management policies. 

Context-free grammars, called management grammars, which describe the software 
development process have been proposed, and recognizers for such grammars are cur- 
rently being developed. In use, the recognizers will permit only approved activities 
by software project staff and will collect management data routinely and automatical- 
ly as the project proceeds. 
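
A management grammar and its recognizer might be sketched as follows. The
grammar, the activity names, and the class below are illustrative assumptions
and are not taken from SAGA; the sketch shows only how a recognizer can reject
an unapproved sequence of activities while tallying management data as a side
effect.

```python
# Hypothetical management grammar (an assumption, not SAGA's):
#   module -> "design" "design_review" cycle "release"
#   cycle  -> "code" "code_review" "test" [ "fix" cycle ]


class ActivityRecognizer:
    """Recursive-descent recognizer over a log of project activities."""

    def __init__(self, activities):
        self.activities = list(activities)
        self.pos = 0
        self.counts = {}  # management data collected while recognizing

    def peek(self):
        if self.pos < len(self.activities):
            return self.activities[self.pos]
        return None

    def expect(self, name):
        if self.peek() != name:
            raise ValueError(f"activity {self.pos}: expected {name!r}")
        self.counts[name] = self.counts.get(name, 0) + 1
        self.pos += 1

    def cycle(self):
        for step in ("code", "code_review", "test"):
            self.expect(step)
        if self.peek() == "fix":  # optional rework loop
            self.expect("fix")
            self.cycle()

    def module(self):
        self.expect("design")
        self.expect("design_review")
        self.cycle()
        self.expect("release")
        if self.peek() is not None:
            raise ValueError("activity recorded after release")
        return self.counts


def is_approved(activities):
    """True if the activity log is a sentence of the management grammar."""
    try:
        ActivityRecognizer(activities).module()
        return True
    except ValueError:
        return False


approved_log = ["design", "design_review", "code", "code_review", "test",
                "fix", "code", "code_review", "test", "release"]
rejected_log = ["design", "code", "test", "release"]  # skips both reviews
```

The count of code reviews returned by module() is one example of data that
could be gathered automatically as the project proceeds.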

The goals of the SAGA project represent an aggressive effort towards making the
process of software development more visible. This project illuminates the need to 
use computers to control the development of computer programs. If successful, the 
project will enable management to make decisions based upon accurate up-to-date infor- 
mation and to better control the software development process. 

2.3 The Cleanroom Approach to Reliable Software Development 

The IBM Cleanroom Software Development Project constitutes a technical and organi-
zational approach to developing software products with certifiable reliability. This
approach divides software development into two parts: software design engineering and
software product engineering. The design engineer creates the product and the prod- 
uct engineer certifies it. 

The methods employed by the design engineers include stepwise refinement, cor- 
rectness proving, finite state machine definitions, and the use of a design language. 

In the coding phase, the design engineers use high-level programming languages, 
structured programming techniques, and code reviews. They are trained to have the 
attitude that they can produce error-free software and are permitted to perform only 
syntax checks on their code. 

The product engineers essentially debug the software produced by the design 
engineers. The strategy used by the product engineers involves blind testing in 
which design details are hidden. This testing is accomplished by analyzing the input 
probability distributions, generating random inputs according to these distributions, 
and recording failure data. Mean-time-before-failure statistics are generated, and 
the product's reliability is estimated using Musa's execution time model [6]. 
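
The product engineers' procedure might be sketched as follows, assuming a
hypothetical program under test, an assumed input distribution, and an oracle
that knows the correct output. The estimate computed here is a crude mean
number of runs per failure; it is not Musa's execution time model.

```python
import random


def program_under_test(x):
    """Hypothetical program: should return abs(x) but is wrong for x == -3."""
    return x if x >= 0 or x == -3 else -x


def oracle(x):
    """Assumed source of correct outputs (often unavailable in practice)."""
    return abs(x)


def random_test(n_runs, seed=0):
    """Draw inputs from the assumed usage distribution and count failures."""
    rng = random.Random(seed)
    # Assumed input distribution: integers -5..5, small magnitudes more likely.
    values = list(range(-5, 6))
    weights = [6 - abs(v) for v in values]
    failures = 0
    for _ in range(n_runs):
        x = rng.choices(values, weights=weights)[0]
        if program_under_test(x) != oracle(x):
            failures += 1
    return failures


n_runs = 10_000
failures = random_test(n_runs)
# Crude estimate only: mean number of runs per observed failure.
mean_runs_to_failure = n_runs / failures if failures else float("inf")
```

The testers see only inputs and outputs, so the design details of
program_under_test remain hidden from them, as in blind testing.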

Regression testing occurs as part of the failure diagnostic support. 

Two premises underlie the Cleanroom approach to developing highly reliable soft- 
ware. One premise is that individuals can be taught to write correct programs. 
Arguments which support and question this assumption can be constructed. By removing 
the crutch of testing, the designers will most probably be more conscientious in their 
code development and more apt to subject their code to extensive desk checking. On 
the other hand, are humans actually capable of consistently writing error-free code? 

The second premise is that randomized testing by itself is sufficient. Random- 
ized testing definitely avoids the 'fix the bug' and 'intended use' syndromes which 
designers are prone to exhibit during testing. On the other hand, path testing and 
tests which detect error types that occur most frequently are far from useless. 
Furthermore, a difficulty encountered in randomized testing is the inability to pre- 
dict the correct output. 

2.4 Preimplementation Phases of Software Development 

A range of approaches [7] exists for specifying the preimplementation informa-
tion needed in the initial phases of software development. These specification
approaches include both the informal traditional and information flow methods and the 
more formal state-based, expression-based, axiomatic, and temporal logic descriptions. 
Coupled with the choice of specification language are the types of analysis that it
is desirable to perform. Checking the consistency of the way the information is used 
represents an extant analytical technique. Rapid prototyping, simulation, and 
testing constitute viable yet infant approaches to ensuring the correctness of formal 
language specifications. Proof methods may be useful for verifying specifications
written in axiomatic description languages. 

A clear delineation between the types of information recorded during each of the
preimplementation phases of software development does not exist. For example, the 
distinction between requirements and design is often vague. The boundaries of and 
transition between each of these phases must be precisely defined. This definition 
is prerequisite to the selection of a description language. The choice of a descrip- 
tion language should depend upon the amount and type of information to be specified 
during each preimplementation phase. 

For flight control systems, a language suitable for describing the desired con- 
currency, real-time constraints, and response to exceptions is needed. Research 
on the types of analysis necessary to establish the completeness and correctness of 
flight control software requirements is also needed. 

2.5 Programming Languages 

Requirements for high-level programming languages include power of expression 
and reliability. ADA, EUCLID, and GYPSY are examples of high-level programming lan- 
guages developed for producing reliable software. Features of these languages in- 
clude strong data typing, the use of data and procedure abstractions, exception han- 
dling facilities, and the ability to express concurrency. Strong data typing permits 
the compile-time checking of the consistent use of variables. Data abstractions and 
procedure abstractions are mechanisms for selectively hiding objects and allowing 
partial access. The inclusion of precertified packaged routines and library facili- 
ties also enhances the reliability of programs written in these languages. 

Although they may offer the ability to write more reliable programs, these lan- 
guages do not guarantee the production of software having a reliability of the order 
necessary because many of the traditional sources of error remain possible. In addi- 
tion, their language implementations have not been certified as reliable. Proving 
the correctness of an entire implementation for a language like ADA is beyond the 
current state of the art. 

One of the difficulties underlying the formal verification of an entire language 
implementation stems from the lack of adequate methods for defining the semantics of 
programming languages, and the considerable body of support software needed in addi- 
tion to a compiler. Except for the control flow constructs, the run-time structure 
of a program differs from the static compile-time structure. Demonstrating that a 
program is ultra-reliable will require knowledge of its run-time structure and hence 
knowledge of the compiler's implementation (i.e. the language semantics). This 
demonstration is referred to as proof of security of implementation. Verification 
based on models of programs as they are executed requires additional research. 

Proofs of implementation that include all support software as well as the compiler
also require extensive additional research [8,9].

2.6 Software Testing 

One method for constructing test data sets which yield increased confidence in 
program correctness involves evaluating test data sets by introducing errors into a 
program P. This method is known as program mutation [10]. Program mutation consists 
of constructing a test data set, executing P with the test data, introducing errors 
into P to form a mutant P', executing P' using the original test data, and comparing 
the results to see if the test data distinguished P from P'. The number of program
mutations constructed is reduced by assuming that the programmers are competent and 
will try to deliver a correct program. For example, mutating a program by deleting 
it entirely is a valid mutation but is clearly pointless. 

Metrics can be calculated for various test data sets. One possible metric is
the percent of nonequivalent mutants of P which were distinguished by the test data. 
The usefulness of this metric lies in its ease of computation. 
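
The procedure and the metric can be sketched as follows. The program P, the
hand-seeded mutants, and the test data sets are hypothetical; an actual
mutation system would generate the mutants mechanically.

```python
def original(a, b):
    """Program P: return the larger of two numbers."""
    return a if a > b else b


# Hand-written mutants P', each containing one simple seeded error.
mutants = [
    lambda a, b: a if a < b else b,   # relational operator replaced
    lambda a, b: b if a > b else b,   # wrong variable returned
    lambda a, b: a if a > b else a,   # wrong variable returned
    lambda a, b: a if a >= b else b,  # equivalent to P: cannot be distinguished
]
EQUIVALENT = {3}  # indices of mutants known to be equivalent to P


def mutation_score(test_data):
    """Percent of nonequivalent mutants distinguished from P by test_data."""
    live = [m for i, m in enumerate(mutants) if i not in EQUIVALENT]
    killed = sum(
        any(m(a, b) != original(a, b) for a, b in test_data) for m in live
    )
    return 100.0 * killed / len(live)


weak_tests = [(2, 1)]            # leaves one nonequivalent mutant alive
strong_tests = [(2, 1), (1, 2)]  # distinguishes every nonequivalent mutant
```

In this sketch the equivalent mutant is identified by hand; deciding
equivalence mechanically is not addressed.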

This methodology evaluates the effectiveness of test data sets in detecting var- 
ious types of simple errors. Program failures resulting from incomplete specifica- 
tions or missed requirements are excluded. Some data exist which indicate that the 
majority of complex errors are composed of combinations of simple errors. This
phenomenon implies that attaining a high degree of test coverage for simple errors 
will provide some coverage of the more complex errors. Further characterization of 
the error space is needed. 

The mutation method of program testing represents one approach to evaluating 
test data sets. Other approaches exist in the literature [11]. A study which evalu- 
ates the efficiency of the various testing approaches constitutes a valid research 
need. 


2.7 Software Fault Tolerance 

Software fault tolerance methods are methods for developing software which is 
tolerant of software faults [12]. A software fault is defined as a design defect in 
the software, where the term "design defect" encompasses all deficiencies introduced 
throughout the software development process. Manifestation of a software fault 
places the system in an erroneous state, which may lead to system failure. Recovery 
blocks, n-version programming, and robust data structures are fault-tolerant mechan- 
isms advocated in the literature today. 
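
Two of these mechanisms might be sketched as follows. The square-root routines,
the acceptance test, and the rounding tolerance used for voting are assumptions
chosen for illustration; the sketch omits the state restoration that a recovery
block needs when alternates modify shared data.

```python
import math
from collections import Counter


def recovery_block(x, alternates, acceptance_test):
    """Try each alternate in turn; return the first result that is accepted."""
    for alternate in alternates:
        try:
            result = alternate(x)  # no shared state here, so nothing to restore
        except Exception:
            continue
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def n_version(x, versions):
    """Run independently written versions and return the majority result."""
    results = Counter(round(v(x), 9) for v in versions)
    value, count = results.most_common(1)[0]
    if count <= len(versions) / 2:
        raise RuntimeError("no majority among versions")
    return value


def faulty_sqrt(x):
    """Version containing a seeded design fault."""
    return x / 2


def newton_sqrt(x):
    """Independent version using Newton's iteration."""
    guess = x or 1.0
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return guess


def accept(x, r):
    """Acceptance test: the square of the result must reproduce the input."""
    return abs(r * r - x) < 1e-6


rb = recovery_block(9.0, [faulty_sqrt, newton_sqrt], accept)
nv = n_version(9.0, [faulty_sqrt, newton_sqrt, math.sqrt])
```

The recovery block depends on the quality of its acceptance test, whereas
n-version programming depends on the versions not failing identically.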

Software fault tolerance methods are needed because fault avoidance and fault 
removal methods alone are inadequate for achieving the required level of reliability. 
Implemented in unison, fault tolerance, avoidance, and removal represent a balanced 
approach to producing highly reliable software. 

Experiments which assess the contributions to reliability of the various soft- 
ware fault tolerance approaches are needed. 

2.8 Software for Flight Control Applications 

Producing flight control systems whose reliability is demonstrable requires much 
effort and expense. The difficulties encountered lie more with the software than the 
hardware since a suitable framework does not yet exist for quantitatively assessing 
software reliability. Generally speaking, software reliability is discontinuous. 
Software failures occur as results of random encounters with design faults rather 
than results of continuous degradation or wearing out. 

The successful handling of software errors involves minimizing the likelihood of 
error introduction, improving the effectiveness of methods for detecting hidden de- 
sign faults, and a priori code stabilization. The use of constructive software 
development methodologies helps minimize the number of errors introduced. Tiger team 
inspection represents a suitable means for detecting latent faults. A tiger team is 
a sophisticated group of individuals who function in a constructive yet adversary 
manner. Crucial software (e.g. the executive) may be stabilized by extensive use in 
noncritical applications. Stabilized code could then be placed in libraries for 
multiapplication use.

Since software failures frequently cause the system to exhibit aberrant and
discontinuous behavior, it may be advantageous to invoke a procedure which results in 
a seemingly continuous system recovery. This recovery may be accomplished by 
reinitializing the system to a previously correct state. Note that the effects of 
repeated invocation of a reinitialization procedure can be observed easily [13]. 

2.9 Software Environments - The TOOLPACK Project

A software environment provides programmers with an integrated set of tools 
which assist them in creating software [14]. The TOOLPACK project is an ongoing en- 
deavor to establish the appropriate environment for FORTRAN programmers who write 
small- to medium-sized mathematical programs. 

Encouraging programmers to experience the tools and provide feedback is a basic 
tenet of the TOOLPACK project. To initiate a feedback loop, the project leaders have 
designed small experiments aimed specifically at identifying the proper tools, infor- 
mation base, and user interface. Current components of this integrated system are 
tools which support code development, maintenance, testing, analysis, documentation, 
and portability. A variety of institutions which develop mathematical software are 
participating in this project. 

The data files are an essential component of the TOOLPACK system. Since the 
data files are implemented by the host's file system, portability is enhanced by 
utilizing a very simple and commonly occurring interface. It is difficult to deter- 
mine a priori what the organization and contents of these files should be. An intel- 
ligent guess must be made. Once knowledge of usage patterns increases, an accordant 
information base will be constructed. 

Gaining acceptance by the user community represents a major obstacle for 
TOOLPACK. The incremental development approach, the flexibility provisions (i.e. no 
predefined order of tool uses), and an intelligent editing facility should contribute 
to its acceptance. A remaining critical problem, however, is the response time re- 
quired. An obvious solution might be an overnight run which establishes the initial 
information base. 


2.10 Static Analysis of Concurrency 

A tool which eventually will be incorporated into an integrated environment is 
being developed to statically analyze concurrency in ADA programs. The specific 
analytical capabilities being developed document rendezvous, parallel actions of in- 
terest, and potential deadlock or infinite wait situations. This concurrency analyz- 
er requires the user to have some knowledge of the underlying analysis. In particu- 
lar, the user must be capable of resolving problems which surface during the analy- 
sis. 


For languages which permit rendezvous, analyzing concurrency and detecting er-
roneous conditions is NP-complete. The computation time required for detecting
erroneous situations grows exponentially with n, the number of concurrent tasks being
statically analyzed. This analysis is manageable for n ≤ 5.
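
The source of the exponential growth can be seen in a minimal sketch that
exhaustively explores the joint states of a set of tasks. The task model is a
simplifying assumption (each task is a fixed sequence of entry calls and
accepts, and entries are matched by name only) and is not the analyzer
described above.

```python
from itertools import combinations


def find_deadlocks(tasks):
    """Explore all reachable joint states; return stuck states and state count.

    Each task is a list of (op, entry) pairs with op "call" or "accept".
    A rendezvous fires when one task is at ("call", e) and another is at
    ("accept", e); both tasks then advance by one step.
    """
    start = tuple(0 for _ in tasks)
    seen, frontier, deadlocks = {start}, [start], []
    while frontier:
        state = frontier.pop()
        successors = []
        for i, j in combinations(range(len(tasks)), 2):
            if state[i] >= len(tasks[i]) or state[j] >= len(tasks[j]):
                continue  # a finished task cannot take part in a rendezvous
            (op_i, e_i), (op_j, e_j) = tasks[i][state[i]], tasks[j][state[j]]
            if e_i == e_j and {op_i, op_j} == {"call", "accept"}:
                nxt = list(state)
                nxt[i] += 1
                nxt[j] += 1
                successors.append(tuple(nxt))
        finished = all(pc == len(t) for pc, t in zip(state, tasks))
        if not successors and not finished:
            deadlocks.append(state)  # deadlock or infinite wait
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return deadlocks, len(seen)


# Two tasks that each call the other before accepting: both wait forever.
bad = [[("call", "b"), ("accept", "a")],
       [("call", "a"), ("accept", "b")]]
# The same work, but the second task accepts before it calls.
good = [[("call", "b"), ("accept", "a")],
        [("accept", "b"), ("call", "a")]]
```

A joint state holds one program counter per task, so the number of states that
may need to be visited is bounded by the product of the task lengths and can
grow exponentially with the number of tasks.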

Heuristics may be developed to allow analysis of larger systems of tasks. How- 
ever, work is proceeding on an algorithm to reduce the complexity of systems involv- 
ing larger numbers of tasks by partitioning the set and analyzing each set indepen- 
dently [15,16]. 

2.11 Software Reliability Measurement


An important issue in crucial software development is the quantification of its 
reliability. Problems encountered in assessing the reliability of flight software 
may prove more difficult than those encountered during attempts to produce it. 

Reviews of software reliability models can be found in the literature [17,18,19].
Assuming that these models are applicable to the systems of interest, they will still 
require a very large amount of execution time in order to estimate reliability when 
it is of the order required. 
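
The scale of the problem can be illustrated with a rough calculation, assuming
a constant failure rate and failure-free testing. This exponential model and
the 95 percent confidence level are assumptions made for illustration only.

```python
import math


def failure_free_hours(target_rate_per_hour, confidence):
    """Hours of failure-free execution needed to support the target rate.

    Assumes a constant failure rate: if the true rate were as high as the
    target, the chance of observing no failure in t hours is exp(-rate * t).
    The test must run until that chance is at most 1 - confidence.
    """
    return -math.log(1.0 - confidence) / target_rate_per_hour


# A failure probability of 10^-9 per 10-hour flight is roughly 10^-10 per hour.
rate = 1e-9 / 10
hours = failure_free_hours(rate, confidence=0.95)
years = hours / (24 * 365)
```

Under these assumptions about 3 x 10^10 hours of failure-free execution, or
more than three million years on a single test system, would be needed.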

A potential solution is to develop a model which investigates the internal 
structure of the software. In developing such a model, reliability theorems for 
hardware designs may prove useful. A theorem from hardware reliability states that 
redundancy at the component level results in a more reliable system than redundancy 
at the system level. This theorem may be useful for assessing the contributions that 
the various fault-tolerant methods make to reliability. It remains to be investigat- 
ed whether or not this theorem holds true for software. 
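
For a series system with independent component failures and perfect switching
between replicas (both assumptions), the hardware result can be checked
numerically. The component reliability and system size below are arbitrary
illustrative values.

```python
def component_level(r, n, k=2):
    """Series system of n components, each component replicated k times."""
    return (1 - (1 - r) ** k) ** n


def system_level(r, n, k=2):
    """The whole n-component series system replicated k times."""
    return 1 - (1 - r ** n) ** k


r, n = 0.9, 5
comp = component_level(r, n)  # about 0.951
syst = system_level(r, n)     # about 0.832
```

Whether replicated software components fail independently is exactly the
question that would have to be settled before such a formula could be applied
to software.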

The modeling of software reliability is extremely important to the development 
of acceptable crucial systems. It is not clear, however, whether the existing models 
of reliability are appropriate to digital flight control system software. In addi- 
tion, these models have not been extensively validated by experiment. A great deal 
of research is needed before believable reliability figures can be associated with 
software. 


2.12 Formal Verification of SIFT 

Formally verifying the design of SIFT entailed the specification of a hierarchy 
of models [20]. The highest level model within this hierarchy is the I/O model.
This model succinctly describes the required properties of the system. To gain assur-
ance in the correctness of the SIFT design, the policy maker needs to understand only
the description of the axioms of this top level model. In SIFT, there are six such 
axiomatic statements. The lowest level model in the SIFT hierarchy describes the 
program executed by the hardware. 

The models of the system are specified using different languages. Axioms in one 
model are mapped to axioms specified in the next level model. Given these mappings, 
verification of the design of SIFT involves showing that each axiom specified in a 
higher level model is provable as a theorem of a lower level model. This methodology 
alleviates the dangers of inconsistency by providing assurance that one axiom is 
derivable from the next. 

The proof of correctness of the SIFT software is a little over 100 pages. This 
does not include the lowest of the six levels. The majority of difficulties in the 
proof technique have been eliminated and it is anticipated that an intelligent comp- 
uter science graduate could prove the software correct in about 6 months. 

Essential to using a semiautomated theorem prover is the creation of a symbiot- 
ic relationship between man and machine. The human understands the proof and can use 
intuition to choose between the numerous paths which could be taken. The computer 
system is useful for simplification and provides the required bookkeeping services. 
Critical aspects of this symbiosis are short response times between the steps of the 
proof and adequate information display facilities. If the response time is too long, 
the human is forced to specify greater detail and take smaller steps in directing the 
system. Proving SIFT required approximately 18 megabits of store. Approximately
500 lemmas were created, which posed a considerable bookkeeping problem. Part of the
problem was the difficulty of maintaining meaningful lemmas on a single screen. 

The construction of the SIFT design proof was pedagogical. It demonstrated the 
feasibility of design proofs and highlighted the importance of the man-machine symbi-
osis. It also reinforced the need for simplicity. Violation of simplicity may pre-
sent undue restrictions on what can be assuredly demonstrated about a system.

Further work which will enhance the formal verification process is needed. 

2.13 System Specification and Program Transformation 

A trend in reliable software development involves the definition of new ways in 
which support software can be used to effectively eliminate sources of error. Dis-
crepancies between the user's intent and the actual system specification constitute a
major source of errors. Transitions between steps in the system life cycle are 
another potential source of errors. Software which assists in the formulation of 
system specifications and semiautomates program transformations should enhance the 
reliability of a system by eliminating these error sources. Developing this support 
software is not an easy feat. 

Specification Acquisition from Experts (SAFE) is a prototype tool which simpli- 
fies the creation of a formal specification [21]. The development of this tool is 
highly desirable as it should increase the reliability of the specification process 
and does not require specialized training. It would also make the formal specifica-
tion more maintainable, since the informal specification can be modified and semi-
automatically retransformed into the formal specification. Making the SAFE system
interactive helps eliminate the problem of computer misinterpretation of the informal
specification. It has been demonstrated that this interaction can be kept to a mini-
mum so as not to abrogate the advantages of informality. Part of this project
involved the development of a suitable formal specification language.

Developing computer-based tools which support the user during the development of 
a program by mechanically transforming formal specifications into efficient implemen- 
tations should improve reliability. If the transformation correctly preserves seman- 
tics, as intended, new errors cannot be introduced. A prototype program transforma- 
tion tool is currently under development at USC-ISI [22]. This tool transforms speci- 
fications using a methodology similar to that used for verifying SIFT. The main dif- 
ference is that program optimization and maintainability rather than verification are 
of concern. Maintainability is improved since changes and enhancements are effected 
by modifying the specification and allowing the computer to redo the transformations 
which resulted in the original optimized implementation. Since optimizing a program
adds to its complexity and hence detracts from its reliability, this automated
approach seems superior to conventional maintenance.
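
The idea of a semantics-preserving transformation can be illustrated with a minimal sketch. It is not the USC-ISI tool or its method; the expression types, the rewrite rules, and the names `evaluate`, `simplify`, and `same_meaning` are all hypothetical and chosen only for illustration. A small integer expression stands in for the "specification," a handful of algebraic rewrites stand in for the mechanical transformation, and a spot check over sample inputs compares the two. The spot check is a test, not a proof; a real tool would have to justify each rewrite rule formally.

```python
from dataclasses import dataclass
from itertools import product
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, Add, Mul]


def evaluate(e, env):
    """Reference meaning of an expression over integer variables."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    left, right = evaluate(e.left, env), evaluate(e.right, env)
    return left + right if isinstance(e, Add) else left * right


def simplify(e):
    """Apply rewrites that hold for integers: constant folding,
    x + 0 -> x, x * 1 -> x, x * 0 -> 0."""
    if isinstance(e, (Num, Var)):
        return e
    left, right = simplify(e.left), simplify(e.right)
    if isinstance(left, Num) and isinstance(right, Num):
        # Both operands are constants: fold them into one constant.
        return Num(evaluate(type(e)(left, right), {}))
    if isinstance(e, Add):
        if left == Num(0):
            return right
        if right == Num(0):
            return left
        return Add(left, right)
    # Remaining case: Mul
    if left == Num(0) or right == Num(0):
        return Num(0)
    if left == Num(1):
        return right
    if right == Num(1):
        return left
    return Mul(left, right)


def same_meaning(a, b, var_names, samples=range(-3, 4)):
    """Spot-check (not prove) that a and b agree on all sample inputs."""
    for values in product(samples, repeat=len(var_names)):
        env = dict(zip(var_names, values))
        if evaluate(a, env) != evaluate(b, env):
            return False
    return True


# Hypothetical "specification": x * 1 + 2 * 3
spec = Add(Mul(Var("x"), Num(1)), Mul(Num(2), Num(3)))
implementation = simplify(spec)
```

In this sketch a maintenance change is made by editing `spec` and rerunning `simplify`, rather than by editing `implementation` by hand, which mirrors the maintenance argument made above.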

These tools require additional development before they will be useful for
production. Since the man-machine interface is extremely important, some trial use and
feedback, as with the TOOLPACK project, will be necessary. Note that these types of
automation would allow one person to create a software system. The problem of
supporting development by a group of people has not yet been tackled. An effective
communication mechanism would have to be developed in this case.

3.0 CONCLUSIONS


The overriding group consensus was that the currently stated reliability 
requirements for software alone cannot be achieved or confirmed with current technol- 
ogy. Available evidence indicates that current reliability figures are orders of 
magnitude less than required. 

A concern which was voiced repeatedly by working group participants was the need 
for improved methods of defining requirements and specifications. Experience has 
shown that inconsistent or inadequate requirements definitions are a constant source 
of errors which are exceedingly difficult to find. 

In the short term the best overall approach to software development is to employ 
an eclectic set of complementary techniques. The integration of fault avoidance, 
fault tolerance, fault removal, and other software engineering methods should yield a 
substantial improvement in overall reliability. Component technology at all levels 
of software development is also recommended as the components may be separately vali- 
dated and reused. 

In the long term, formal modeling and definition methods should prove invalu- 
able at all levels of software development for the projected reliability require- 
ments. Techniques which investigate program structure should prove immensely useful. 
Demonstrating absolute equivalence between a specification and an implementation may 
be technically attainable. There will remain, however, the difficulty of achieving 
the absolute equivalence of the specifications with the intent.

The following is a list of the general comments agreed upon by the majority of
the meeting participants during the discussion period held in the latter part of the
second day. They are not prioritized.

Formal specifications are a critical aspect of super-reliable software and
presuppose a mechanism for determining the equivalence of the specification with the
intent. We do not have technology for addressing that problem yet.

The currently stated reliability requirements for software alone cannot be con- 
firmed with current technology. All available evidence indicates that we are 
currently several orders of magnitude short of the stated figure in general. It 
is unlikely that we can achieve ultra reliability by incremental improvements in 
reliability. 

There is serious doubt that it is presently possible to produce flight software 
systems having the stated level of reliability and to assure that they have that 
level of reliability.

We do not have a measure of the level of reliability that can be assured by a 
methodology, or the ability to compare the levels assured by different methodol- 
ogies. 

The reliability of any flight software depends on the reliability of the con- 
siderable body of support software (tools, language processors, etc.). 

Within the foreseeable future it will not be possible to define highly reliable
requirements of an arbitrarily complex system. We must learn to limit the
complexity of systems or at least of those parts that must be reliable.



"In the short term, the best approach is to be eclectic. We recommend an inte- 
gration of fault avoidance, fault tolerance, etc. and we expect a substantial 
improvement. We don't know what the right mix is!" 

"In the longer term, absolute equivalence between a specification and an imple- 
mentation may be technically attainable."

"Formal modeling and definition methods are invaluable at all levels of soft- 
ware development for the projected reliability requirements. Make them compre- 
hensible, concise and intellectually manageable by mere mortals." 

"We recommend component technology at all levels of development. These can be 
separately validated and reused." 

"The contributions which the various software engineering approaches make to 
reliability need to be quantitatively determined." 

"We are dismayed that in the area of hardware reliability little or no attention 
is given to modeling and analyzing design faults. These faults are similar to 
software faults, and are the source of most system unreliability."

"Do what we already know in real applications." 

"The internal structure of the software cannot be ignored." 

Due to the scope and inherent complexity of the problem being addressed, a pri- 
oritization of research needs was requested. This prioritization was accomplished by 
a vote in which each meeting participant ranked three short-term and three long-term 
research needs. Table 1 summarizes the results of this vote and shows formal
specification, software environments/tools, reliability modeling, fault-tolerant
designs, and formal verification as the foremost short-term research needs. Research
in reliability modeling, formal specification, and formal verification is indicated 
for the long term. These recommendations are listed by participant in the appendix. 

Recommendations for research included suggestions that AIRLAB be a repository 
for sample flight control problems of various sizes. These problems would be useful 
for quantitatively evaluating the various approaches to reliable software production. 
This evaluation could take the form of statistically controlled software development 
experiments. These experiments would involve the fabrication of software solutions 
for real applications and require all activities from the informal problem statement 
to the highly reliable software product. Thus, these experiments would be termed 
"end-to-end." An experimenter might deliver a prepackaged system component and the 
overall development process could be evaluated within AIRLAB. Thus, AIRLAB could 
serve as a repository of sample problems, system development tools, and experimental 
results. It could be a place where comparative and competitive studies are performed 
as well as a focal point for additional workshops. 

The following is a list of AIRLAB themes recommended by the meeting
participants:

• Repository of sample avionics problems, various sizes 

• Tool and result repository 

• University experiments in all areas 

• Coordination and integration of results

• Comparative, competitive studies

• Additional workshops 

• Collection of statistically meaningful data 

APPENDIX

PRIORITIZATION OF RECOMMENDED RESEARCH ACTIVITIES 
PARTICIPANT - T. ANDERSON 
Short Term 

1. Reliability measurements & requirements/specification (tie) 

2. Software fault tolerance 

3. Real-time issues 

Long Term 

1. Validation 

2. More requirements 

3. Development tools 


PARTICIPANT - R. CAMPBELL 
Short Term 


1. Reliability - how to measure - actual ways to measure 

2. Formal verification and validation of complete software life cycle including 
requirements, maintenance and testing 

3. Tools to aid in measuring reliability and formal verification and validation
of complete software life cycle including requirements, maintenance and testing 

Long Term 


1. Actual ways to measure reliability 

2. Formal verification and validation of complete software life cycle including 
requirements, maintenance and testing plus much more integration to form a 
"product" 

3. Tools to aid in measuring reliability and formal verification and validation of 
complete software life cycle including requirements, maintenance and testing plus 
much more integration to form a "product" 

PARTICIPANT - F. DONAGHE 
Short Term 


1. Specifications

2. Implementation

3. Verification

Long Term 

1. Specifications 

2. Implementation 

3. Verification 

PARTICIPANT - M. DYER 
Short Term 

1. SIFT type verification 

2. Fault tolerance (see Anderson) 

3. Reliability measures

Long Term 

1. Specification methods 

2. Spanning specifications to implementation

3. Packaging for components

PARTICIPANT - B. LITTLEWOOD 
Short Term 

1. Stochastic reliability modeling of software fault-tolerant systems 

2. Comparison of performance of existing (and future?) software reliability models
on real data sets

3. Requirements/specifications fault tolerance 

Long Term 

1. Relationship between other metrics (and quality of them) and software 
reliability

2. Comparison of subjective beliefs and actual performance; consensus techniques 
between "expert" witnesses 

PARTICIPANT - M. MELLIAR-SMITH 

Short Term 

1. Formal verification 

2. Formal requirements 

3. Testing 

Long Term


1. Formal verification 

2. Formal requirements

3. Software reliability measurement 

PARTICIPANT - H. MILLS 
Short Term 

1. AIRLAB environment for end-to-end model projects 

2. Techniques for formal and readable flight software specifications 

3. Technical standards for high reliability software development 

Long Term 

1. Specifications of module (package) library for flight software reuse 

2. Automatic programming methods peculiar to flight software 

3. Relation of catastrophic theory to software requirements and specifications

PARTICIPANT - T. PRATT 
Short Term 

1. Formal methods for specifying and verifying the correctness and consistency of
precoding stages of software development: requirements, specifications, design

2. Integrated software development environments 

- management tools and software 

- construction/analysis tools 

3. Testing methods and quantifying the reliability of software after testing 

Long Term 

1. Formal methods for specifying and verifying the correctness and consistency of
precoding stages of software development: requirements, specifications, design

2. Integrated software development environments 

- management tools and software 

- construction/analysis tools 

3. Testing methods and quantifying the reliability of software after testing 

PARTICIPANT - R. TAYLOR


Short Term 

1. A tool environment incorporating the best available technology of 

Requirements analysis 
Preliminary design
Detailed design 
Coding tools 

Verification and test tools 
(NTS + preimplementation) 

2. Software fault tolerance techniques 

3. Software management issues 

Long Term 

1. Preimplementation tools, requirements foremost 

2. Program transformation tools 

3. Formal verification & Reliability assessment (tie) 

PARTICIPANT - J. WILEDEN 
Short Term 

1. Specifications/requirements tools, especially assessment/animation, for flight 
control software 

2. Contribution of software fault-tolerance to flight control software reliability 

3. Contribution of testing to flight control software reliability 

Long Term 

1. Appropriate methods for measuring reliability and appropriate goals

2. Formal verification 

3. Transformation (computer-assisted) implementation from specifications 